Learning good representations of gigapixel whole-slide pathology images (WSIs) for downstream tasks is critical. Previous studies employ multiple instance learning (MIL) to represent WSIs as bags of sampled patches because, in most cases, only slide-level labels are available and only a tiny region of the WSI is the disease-positive area. However, WSI representation learning remains an open problem due to: (1) patch sampling at a higher resolution may be incapable of depicting microenvironment information, such as the relative positions between tumor cells and surrounding tissues, while patches at a lower resolution lose fine-grained detail; (2) extracting patches from a giant WSI results in a large bag size, which tremendously increases the computational cost. To solve these problems, this paper proposes a hierarchical multimodal transformer framework that learns a hierarchical mapping between pathology images and the corresponding genes. Specifically, we randomly extract instance-level patch features from WSIs at different magnifications. A co-attention mapping between imaging and genomics is then learned to uncover pairwise interactions and reduce the space complexity of the imaging features. Such early fusion makes it computationally feasible to use a MIL transformer for the survival prediction task. Our architecture requires fewer GPU resources than benchmark methods while maintaining better WSI representation ability. We evaluate our approach on five cancer types from The Cancer Genome Atlas database and achieve an average c-index of $0.673$, outperforming state-of-the-art multimodal methods.
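The co-attention early fusion described above can be sketched as follows. This is a minimal numpy illustration of the idea (all dimensions, function names, and the single-head attention form are illustrative assumptions, not the paper's implementation): a small set of genomic embeddings attends over a large bag of patch features, so the MIL transformer only needs to process the fused tokens.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def coattention_fuse(patch_feats, gene_feats):
    """Map a large bag of patch features onto the much smaller set of
    genomic embeddings via cross-attention, so the downstream MIL
    transformer sees k fused tokens instead of n raw patches."""
    d = patch_feats.shape[1]
    scores = gene_feats @ patch_feats.T / np.sqrt(d)  # (k, n) pairwise interactions
    attn = softmax(scores, axis=-1)                   # each gene attends over patches
    return attn @ patch_feats                         # (k, d) fused tokens

patches = np.random.randn(5000, 64)  # bag of 5,000 patch embeddings
genes = np.random.randn(8, 64)       # 8 genomic pathway embeddings
fused = coattention_fuse(patches, genes)
print(fused.shape)  # (8, 64): bag size reduced from 5000 to 8
```

The quadratic cost of self-attention then applies to the 8 fused tokens rather than the 5,000 patches, which is what makes the full transformer affordable.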
In recent years, great progress has been made in applying pretrained language models (e.g., BERT) to information retrieval (IR) tasks. Hyperlinks, which are commonly used in web pages, have been exploited to design pretraining objectives. For example, the anchor texts of hyperlinks have been used to simulate queries, thereby constructing huge numbers of query-document pairs for pretraining. However, as a bridge connecting two web pages, the potential of hyperlinks has not been fully explored. In this work, we focus on modeling the relationship between two documents connected by a hyperlink, and design a new pretraining objective for ad-hoc retrieval. Specifically, we categorize the relationship between documents into four groups: no link, unidirectional link, symmetric link, and the most relevant symmetric link. By comparing two documents sampled from adjacent groups, the model can gradually improve its ability to capture matching signals. We propose a Progressive Hyperlink Prediction (PHP) framework to explore the utilization of hyperlinks in pretraining. Experimental results on two large-scale ad-hoc retrieval datasets and six question-answering datasets demonstrate its superiority over existing pretraining methods.
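The four-group categorization and the adjacent-group comparison can be sketched as below. This is a hypothetical illustration of the grouping logic and the pair-sampling curriculum, not the PHP training code; how "most relevant" is determined is left as an input flag here.

```python
def link_group(a_links_b, b_links_a, most_relevant=False):
    """Rank the relation between two web pages by hyperlink structure:
    0 = no link, 1 = unidirectional link, 2 = symmetric link,
    3 = most relevant symmetric link."""
    if a_links_b and b_links_a:
        return 3 if most_relevant else 2
    if a_links_b or b_links_a:
        return 1
    return 0

def progressive_pairs(docs_by_group):
    """Yield (positive, negative) document pairs drawn from adjacent
    groups: the model is trained to score the higher-ranked document
    above the lower-ranked one, one difficulty step at a time."""
    for g in range(3, 0, -1):
        for pos in docs_by_group.get(g, []):
            for neg in docs_by_group.get(g - 1, []):
                yield pos, neg

groups = {0: ["d0"], 1: ["d1"], 2: ["d2"], 3: ["d3"]}
print(list(progressive_pairs(groups)))
# [('d3', 'd2'), ('d2', 'd1'), ('d1', 'd0')]
```

Because only adjacent groups are contrasted, the pretraining signal moves progressively from coarse distinctions (linked vs. unlinked) to fine ones (symmetric vs. most relevant symmetric).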
Generalized text representations are the foundation of many natural language understanding tasks. To make full use of different corpora, it is inevitably necessary to understand the correlations among them. However, many methods ignore these correlations and directly use a single-channel model for all tasks (a coarse paradigm), which lacks sufficient rationality and interpretability. In addition, some existing works learn downstream tasks by stitching together skill blocks (a fine paradigm), which can introduce redundancy and noise and is therefore suboptimal. In this work, we first analyze task correlations from three different perspectives, namely data properties, manual design, and model-based relevance, based on which similar tasks are grouped together. We then propose a hierarchical framework with a coarse-to-fine paradigm, whose bottom level is shared by all tasks, whose middle level is divided into different groups, and whose top level is assigned to each individual task. This allows our model to learn basic language properties from all tasks, boost the performance of related tasks, and reduce the negative impact of unrelated tasks. Our experiments on 13 benchmark datasets across five natural language understanding tasks demonstrate the superiority of our method.
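The shared-bottom / per-group-middle / per-task-top structure can be sketched as follows. This is a minimal numpy sketch under stated assumptions (random linear layers, tanh activations, and the example task/group names are all illustrative, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

class HierarchicalModel:
    """Coarse-to-fine hierarchy: a bottom layer shared by all tasks,
    one middle layer per task group, and one top head per task."""
    def __init__(self, d, groups):
        self.bottom = rng.standard_normal((d, d))
        self.middle = {g: rng.standard_normal((d, d)) for g in groups}
        self.top = {t: rng.standard_normal((d, 1))
                    for ts in groups.values() for t in ts}
        self.group_of = {t: g for g, ts in groups.items() for t in ts}

    def forward(self, x, task):
        h = np.tanh(x @ self.bottom)                       # shared by all tasks
        h = np.tanh(h @ self.middle[self.group_of[task]])  # shared within the group
        return h @ self.top[task]                          # task-specific head

groups = {"classification": ["sst2", "mnli"], "similarity": ["stsb"]}
model = HierarchicalModel(d=16, groups=groups)
out = model.forward(np.ones((2, 16)), task="mnli")
print(out.shape)  # (2, 1)
```

Gradients from every task would update the bottom layer, gradients from related tasks would update their group's middle layer, and each top head stays task-private, which is exactly the sharing pattern the paradigm prescribes.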
Dominant trackers generate a fixed-size rectangular region, based on the previous prediction or the initial bounding box, as the model input (i.e., the search region). Although this manner leads to improved tracking efficiency, a fixed-size search region lacks flexibility and is likely to fail in cases such as fast motion and distractor interference. Trackers tend to lose the target object due to a limited search region, or to be interfered with by distractors due to an excessive search region. In this work, we propose a novel tracking paradigm, called Search Region Regulation Tracking (SRRT), which applies a proposed search region regulator to dynamically estimate an optimal search region for each frame. To adapt to the object's appearance variation during tracking, we further propose a locking-state-determined updating strategy for reference frame updating. Without bells and whistles, our SRRT framework is very concise, yet it achieves obvious improvements over the baselines and competitive results against other state-of-the-art trackers on seven challenging benchmarks. On the large-scale LaSOT benchmark, our SRRT improves SiamRPN++ and TransT with absolute gains of 4.6% and 3.1%, respectively.
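The intuition behind a search region regulator can be sketched as below. This is a hypothetical toy heuristic, not SRRT's learned regulator: it merely shows how a per-frame signal (here, a confidence score, with all scale constants made up) could shrink the region when tracking is stable and expand it when the target is likely lost.

```python
def regulate_search_region(prev_box, confidence, base_scale=4.0,
                           min_scale=2.0, max_scale=8.0):
    """Estimate a per-frame square search region around the previous box:
    shrink it when the tracker is confident (to avoid distractors),
    expand it when confidence drops (to re-find a fast-moving target)."""
    x, y, w, h = prev_box
    scale = base_scale + (1.0 - confidence) * (max_scale - base_scale)
    scale = max(min_scale, min(scale, max_scale))
    side = scale * max(w, h)
    cx, cy = x + w / 2, y + h / 2
    return (cx - side / 2, cy - side / 2, side, side)

tight = regulate_search_region((100, 100, 40, 40), confidence=1.0)
wide = regulate_search_region((100, 100, 40, 40), confidence=0.1)
print(tight[2], wide[2])  # the low-confidence frame gets a larger region
```

SRRT instead estimates the optimal region with a learned regulator, but the fixed-vs-adaptive trade-off it addresses is the one this toy exposes.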
Correlation plays a critical role in the tracking field, especially in recent popular Siamese-based trackers. The correlation operation is a simple fusion manner to consider the similarity between the template and the search region.
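The correlation operation referred to above can be illustrated as a naive sliding dot product. This is a minimal numpy sketch only (real Siamese trackers implement it as an optimized depthwise convolution over deep features):

```python
import numpy as np

def correlation(template, search):
    """Slide the template feature map over the search feature map and take
    the full dot product at each offset: the simple fusion used by
    Siamese trackers to score template/search similarity."""
    c, th, tw = template.shape
    _, sh, sw = search.shape
    out = np.zeros((sh - th + 1, sw - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(template * search[:, i:i + th, j:j + tw])
    return out

template = np.ones((8, 3, 3))     # 8-channel 3x3 template features
search = np.zeros((8, 7, 7))      # 8-channel 7x7 search-region features
search[:, 2:5, 2:5] = 1.0         # the "target" sits at offset (2, 2)
score = correlation(template, search)
peak = np.unravel_index(score.argmax(), score.shape)
print(peak)  # the response map peaks at the target's offset
```

The response map's peak localizes the target; the limitation motivating attention-based fusion is that this operation is a purely linear matching process.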
Electromagnetic source imaging (ESI) requires solving a highly ill-posed inverse problem. To seek a unique solution, traditional ESI methods impose various forms of priors that may not accurately reflect the actual source properties, which may hinder their broad application. To overcome this limitation, in this paper a novel data-synthesized spatio-temporal convolutional encoder-decoder network method, termed DST-CedNet, is proposed. DST-CedNet recasts ESI as a machine learning problem, in which discriminative learning and latent-space representation are integrated into a convolutional encoder-decoder network (CedNet) to learn a robust mapping from the measured electroencephalography/magnetoencephalography (E/MEG) signals to the brain activity. In particular, by incorporating prior knowledge about dynamic brain activity, a novel data-synthesis strategy is designed to generate large-scale samples for effectively training CedNet. This is in contrast to traditional ESI methods, where prior information is often enforced via constraints primarily intended for mathematical convenience. Extensive numerical experiments, as well as analyses of real MEG and epileptic EEG datasets, demonstrate that DST-CedNet outperforms several state-of-the-art ESI methods in robustly estimating source signals under a variety of source configurations.
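The data-synthesis idea can be sketched as below: simulate sparse sources with smooth temporal dynamics, project them to the sensors through a leadfield, and add noise, yielding (measurement, source) training pairs. This is a minimal numpy sketch under stated assumptions (sinusoidal-windowed dynamics, sensor/source counts, and the SNR model are all illustrative choices, not the paper's synthesis procedure):

```python
import numpy as np

rng = np.random.default_rng(42)

def synthesize_sample(leadfield, n_sources, n_times, n_active=2, snr_db=10.0):
    """Generate one (E/MEG measurement, source activity) training pair:
    a few spatially sparse sources with smooth temporal dynamics,
    projected to the sensors through the leadfield and corrupted by noise."""
    s = np.zeros((n_sources, n_times))
    active = rng.choice(n_sources, size=n_active, replace=False)
    t = np.linspace(0, 1, n_times)
    for idx in active:
        f = rng.uniform(3, 30)  # oscillation frequency of this source
        s[idx] = np.sin(2 * np.pi * f * t) * np.hanning(n_times)
    x = leadfield @ s           # project sources to the sensor space
    noise = rng.standard_normal(x.shape)
    noise *= np.linalg.norm(x) / (np.linalg.norm(noise) * 10 ** (snr_db / 20))
    return x + noise, s

leadfield = rng.standard_normal((32, 200))  # 32 sensors, 200 source dipoles
x, s = synthesize_sample(leadfield, n_sources=200, n_times=50)
print(x.shape, s.shape)  # (32, 50) (200, 50)
```

An encoder-decoder network trained on many such pairs learns the inverse mapping x → s directly, instead of enforcing the prior through an explicit regularization term.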
Large pretrained language models can easily produce toxic or biased content, which is prohibitive for practical use. In order to detect such toxic generations, existing methods rely on templates, real-world data extraction, crowdsourcing workers, or automatic generation to construct adversarial contexts that are likely to induce toxic generations. However, what type of context is more likely to induce unsafe responses remains under-explored. In this paper, we identify that context toxicity and context category (e.g., \textit{profanity}, \textit{insult}, \textit{drugs}, etc.) are two important factors causing safety issues in response generation. Hence, we propose a method called \emph{reverse generation} to construct adversarial contexts conditioned on a given response, with the flexibility to control the category, toxicity level, and inductivity of the generated contexts. Via reverse generation, we augment the existing BAD dataset and construct a new dataset, BAD+, which contains more than 120K diverse and highly inductive contexts in 12 categories. We test three popular pretrained dialogue models (Blender, DialoGPT, and Plato2) and find that BAD+ can largely expose their safety problems. Furthermore, we show that BAD+ can greatly enhance the safety of generation and reveal the key factors of safety improvement. Our code and dataset are available at \url{https://github.com/thu-coai/Reverse_Generation}.
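Conditioning a generator on a response plus control attributes could look like the sketch below. The control-token format here is entirely hypothetical (the paper does not specify this interface); it only illustrates the shape of the conditioning input: a response, plus controllable category, toxicity, and inductivity.

```python
def reverse_generation_prompt(response, category, toxicity, inductivity):
    """Build the conditioning input for reverse generation: given a
    (possibly unsafe) response, ask a conditional generator for a
    context that would induce it, steered by control attributes.
    The bracketed control-token syntax is an illustrative assumption."""
    controls = (f"[category={category}] [toxicity={toxicity:.1f}] "
                f"[inductivity={inductivity:.1f}]")
    return f"{controls} [response] {response} [context]"

prompt = reverse_generation_prompt(
    response="That's a terrible idea.",
    category="insult", toxicity=0.8, inductivity=0.9)
print(prompt)
```

Sampling many responses and sweeping the control attributes is how a dataset like BAD+ can cover 12 categories with varied toxicity and inductivity.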
With the drive to create a decentralized digital economy, Web 3.0 has become a cornerstone of digital transformation, developed on the basis of computing-force networking, distributed data storage, and blockchain. With the rapid realization of quantum devices, Web 3.0 is being developed in parallel with the deployment of quantum cloud computing and quantum Internet. In this regard, quantum computing first disrupts the original cryptographic systems that protect data security while reshaping modern cryptography with the advantages of quantum computing and communication. Therefore, in this paper, we introduce a quantum blockchain-driven Web 3.0 framework that provides information-theoretic security for decentralized data transferring and payment transactions. First, we present the framework of quantum blockchain-driven Web 3.0 with future-proof security during the transmission of data and transaction information. Next, we discuss the potential applications and challenges of implementing quantum blockchain in Web 3.0. Finally, we describe a use case for quantum non-fungible tokens (NFTs) and propose a quantum deep learning-based optimal auction for NFT trading to maximize the achievable revenue for sufficient liquidity in Web 3.0. In this way, the proposed framework can achieve proven security and sustainability for the next-generation decentralized digital society.
As a virtual world interacting with the real world, the Metaverse encapsulates our expectations for the next-generation Internet while bringing new key performance indicators (KPIs). Conventional ultra-reliable and low-latency communication (URLLC) can satisfy the vast majority of objective service KPIs, but it is difficult for it to provide users with a personalized immersive experience of Metaverse services. Since improving the quality of experience (QoE) can be regarded as a KPI of paramount importance, URLLC is evolving toward next-generation URLLC (xURLLC) to support graphics-based Metaverse services. By allocating more resources to the virtual objects in which users are more interested, a higher QoE can be achieved. In this paper, we study the interaction between a Metaverse service provider (MSP) and a network infrastructure provider (InP) in deploying Metaverse xURLLC services, and provide an optimal contract design framework. Specifically, the MSP's utility, defined as a function of Metaverse users' QoE, is maximized while ensuring the incentives of the InP. To model the QoE of Metaverse xURLLC services, we propose a novel metric named Meta-Immersion that incorporates both objective network KPIs and the subjective feelings of Metaverse users. Using a user-object-attention level (UOAL) dataset, we develop and validate an attention-aware rendering capacity allocation scheme to improve QoE. It is shown that xURLLC achieves an average QoE improvement of 20.1% compared with conventional URLLC. The ratio of QoE improvement is higher, e.g., 40%, when the total resources are limited.
Supported by computing and communication technologies, the Metaverse is expected to bring users an unprecedented service experience. However, the increase in the number of Metaverse users places a heavy demand on network resources, especially for Metaverse services that are based on graphical extended reality and require rendering numerous virtual objects. To make efficient use of network resources and improve the quality of experience (QoE), we design an attention-aware network resource allocation scheme to achieve customized Metaverse services. The aim is to allocate more network resources to the virtual objects in which users are more interested. We first discuss several key technologies related to Metaverse services, including QoE analysis, eye tracking, and remote rendering. We then review existing datasets and propose the user-object-attention level (UOAL) dataset, which contains the ground-truth attention of 30 users to 96 objects in 1,000 images. A tutorial on how to use UOAL is provided. With the help of UOAL, we propose an attention-aware network resource allocation algorithm that has two steps, namely attention prediction and QoE maximization. In particular, we outline the designs of two types of attention prediction methods, namely interest-aware and time-aware prediction. By using the predicted user-object-attention values, network resources such as the rendering capacity of edge devices can be optimally allocated to maximize QoE. Finally, we present promising research directions related to Metaverse services.
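The QoE-maximization step can be sketched as below. This is a toy numpy illustration, not the paper's algorithm: the proportional allocation rule and the attention-weighted log-utility QoE model are both illustrative assumptions, chosen only to show why steering resources toward high-attention objects beats a uniform split.

```python
import numpy as np

def attention_aware_allocation(attention, capacity):
    """Split a fixed rendering capacity across virtual objects in
    proportion to each object's predicted user attention, so the
    objects a user looks at most get the most rendering resource."""
    attention = np.asarray(attention, dtype=float)
    weights = attention / attention.sum()
    return capacity * weights

def qoe(allocation, attention):
    """Toy QoE model: attention-weighted log utility of the per-object
    rendering resource (diminishing returns per object)."""
    return float(np.sum(np.asarray(attention) * np.log1p(allocation)))

attention = [0.6, 0.3, 0.1]  # predicted attention on 3 virtual objects
alloc = attention_aware_allocation(attention, capacity=90.0)
uniform = np.full(3, 30.0)   # attention-agnostic baseline split
print(alloc)
print(qoe(alloc, attention) > qoe(uniform, attention))  # True
```

Under any concave per-object utility, shifting capacity toward high-attention objects raises the attention-weighted total, which is the intuition behind the gains reported over attention-agnostic allocation.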